Nucleic Acids Research
Oxford University Press (OUP)
Preprints posted in the last 30 days, ranked by how well they match Nucleic Acids Research's content profile, based on 1128 papers previously published here. The average preprint has a 0.80% match score for this journal, so anything above that is already an above-average fit.
Wolfram-Schauerte, M.; Trust, C.; Waffenschmidt, N.; Nieselt, K.
Time-resolved transcriptomic profiling has been used to study phage-host interactions for more than a decade. However, the resulting datasets are not readily accessible for custom re-analysis, and resources are lacking that provide standardized processing, storage, and analysis of transcriptomes from phage infections. Here, we present the PhageExpressionAtlas, the first bioinformatics resource for storing time-resolved dual RNA-sequencing data from phage infections. This data was processed uniformly using a custom analysis pipeline and is presented for interactive exploration through visualisation. The PhageExpressionAtlas currently hosts 42 datasets from 23 studies. Using the PhageExpressionAtlas, we replicate key findings from original publications and extend hypothesis testing across multiple phage-host systems. By systematically querying and analyzing the underlying database, we evaluate approaches to phage gene classification and show that uncharacterized phage genes are expressed across all infection phases. Moreover, we provide a comprehensive view of the expression dynamics of anti-phage defenses as well as host- and phage-encoded anti-defense systems in the infection context, indicating unique and conserved patterns of transcriptional regulation underlying bacterial anti-phage immunity and phage counter-strategies. Together, the PhageExpressionAtlas is a unifying resource that democratizes transcriptomics-driven analyses of phage-host interactions and supports integrative cross-study assessment.
Domingues-Silva, B.; Azzalin, C. M.
Mammalian telomeric DNA comprises long tracts of tandem TTAGGG repeats. The same repeats are also found at internal chromosomal regions called interstitial telomeric sequences (ITSs). Telomeres are transcribed into UUAGGG-containing transcripts, named TERRA, which serve multiple functions in maintaining telomere integrity. Complementary RNAs containing C-rich telomeric repeats, named ARIA, have also been identified in a few yeast mutants and in mammalian cells with dysfunctional telomeres. The molecular features and functions of ARIA remain understudied, mainly due to its low abundance and the lack of suitable cellular systems. Here, we show that Chinese hamster ovary (CHO) cells produce abundant TERRA and ARIA transcripts, predominantly originating from ITSs. Both RNAs are polyadenylated, exhibit relatively short half-lives and form large cellular foci. We also show that ARIA depletion leads to exposure of single-stranded (ss) DNA at ITSs and that ssDNA exposure increases when ITS DNA is damaged. ssDNA formation does not require the DNA damage signaling kinases ATM and ATR, nor the exonucleases DNA2 and EXO1; however, ATM prevents excessive ssDNA accumulation when ARIA function is inhibited. These findings establish CHO cells as a powerful model to dissect telomeric RNA functions and reveal ARIA as a key regulator of telomeric repeat DNA integrity.
SONNEVILLE, R.; EVRIN, C.; WRIGHT, J. E.; XIA, Y.; LABIB, K. P. M.
Eukaryotic cells regulate the assembly and activation of the essential DNA helicase at the heart of the chromosome replication machinery, to ensure that the chromosomes are copied just once per cell cycle. The Mcm10 protein is essential for helicase activation in budding yeast, but an equivalent role for MCM10 orthologues in animal cells has not been explored. Moreover, complete deletion of the mcm-10 gene is viable in the nematode Caenorhabditis elegans, suggesting the involvement of additional factors. Here we show that MCM-10 and a second factor called SLD-2 are recruited to chromatin after helicase assembly in the C. elegans early embryo and are jointly required for helicase activation. Moreover, deletion of the Mcm10 gene is viable in mouse embryonic stem cells, but causes synthetic lethality in the absence of RECQL4, which is the orthologue of SLD-2 in vertebrate species. Helicase activation is blocked in the combined absence of MCM10 and RECQL4, mirroring the situation in C. elegans. These findings indicate that metazoan helicase activation requires two conserved factors that are mutated in human disease syndromes.
Fiorentino, J.; Monti, M.; Armaos, A.; Vrachnos, D. M.; Di Rienzo, L.; Tartaglia, G. G.
RNA-binding proteins (RBPs) regulate essential aspects of RNA metabolism, yet accurately identifying RNA-binding domains (RBDs) and quantifying the impact of sequence variation on RNA-binding ability remain challenging. Here, we present HERCULES (Hybrid framEwoRk for RNA-binding domain loCalization and mUtation anaLysis using physicochemical and languagE modelS), a unified sequence-based framework for simultaneous RBD localization, global RNA-binding propensity prediction and mutation effect assessment. HERCULES integrates a fine-tuned protein language model with an explicit residue-level physicochemical module, combining global contextual representations with local mutation-sensitive descriptors. On an independent test set, the HERCULES global score discriminates RBPs from non-RBPs with an AUROC of 0.86. At residue resolution, HERCULES outperforms state-of-the-art sequence-based predictors in identifying canonical, non-canonical and putative RBDs across Pfam-annotated proteins. Using a curated dataset of experimentally validated RNA-binding-disrupting mutations, HERCULES correctly classifies 87% of deleterious variants, including single-amino acid substitutions. Evaluation on experimentally resolved protein-RNA complexes further demonstrates robust residue-level performance and improved generalization when contact annotations are augmented with AlphaFold3-predicted complexes. By unifying domain localization and mutation sensitivity within a single sequence-only framework, HERCULES provides a mechanistically interpretable approach for studying RNA-protein interactions. HERCULES is freely available at https://tools.tartaglialab.com/hercules and as an open-source Python package at https://github.com/tartaglialabIIT/hercules.git.
Quadrini, M.; Tesei, L.
The ability to access, search, and analyse large collections of RNA molecules together with their secondary structure and evolutionary context is essential for comparative and phylogeny-driven studies. Although RNA secondary structure is known to be more conserved than primary sequence, no existing resource systematically associates individual RNA molecules with curated phylogenetic classifications. Here, we introduce PhyloRNA, a curated meta-database that provides large-scale access to RNA secondary structures collected from public resources or derived from experimentally resolved 3D structures. PhyloRNA allows users to search, select, and download extensive sets of RNA molecules in multiple textual formats, each entry being explicitly linked to phylogenetic annotations derived from five curated taxonomy systems. In addition to taxonomic information, each RNA molecule is accompanied by a rich set of descriptors, including pseudoknot order, genus, and three levels of structural abstraction--Core, Core Plus, and Shape--which facilitate comparative analyses across sets of molecules. PhyloRNA is publicly available at https://bdslab.unicam.it/phylorna/ and is regularly updated to incorporate newly available data and revised taxonomic annotations.
Mathis de Fromont, J.; Brosse, A.; Quenette, F.; Guillier, M.
Small regulatory RNAs (sRNAs) are major post-transcriptional regulators in bacteria and, together with transcriptional regulators such as the two-component systems (TCSs), participate in the rapid adaptation of these microorganisms to changing environments. Several examples of paralogous sRNAs with overlapping functions have been reported that could in theory integrate different environmental cues. Consistent with this idea, we have identified the acid-responsive RstB-RstA two-component system, important for virulence of multiple bacterial species, as a specific multicopy activator of the Escherichia coli OmrB sRNA, but not of the paralogous sRNA OmrA. Further characterization of this regulation unexpectedly revealed the asr-ydgU operon, itself a target of RstB-RstA, as a dual modulator of this TCS via two opposite effects. First, the 27-amino-acid small protein YdgU exerts a negative feedback by directly interacting with RstB and, second, Asr in contrast mediates a positive feedback on RstB-RstA activity via a mechanism that is not yet fully elucidated. These results provide a new example of retro-control of a TCS, here RstB-RstA, by one of its direct targets. They further highlight the major role of small proteins in controlling TCS activity, and ydgU was thus renamed samT, for Small Acid-responsive Modulator of the RstB-RstA TCS.
Roberson, A. B.; Marks, J.; Pitts, R.; Tamilselvam, B.; Grieb, B.; Tansey, W. P.; Meydan, S.
5-Azacytidine (5-AzaC) is a cytidine analog and is widely used to treat myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML). Although its therapeutic activity is primarily attributed to hypomethylation resulting from DNA incorporation, the majority of 5-AzaC is incorporated into RNA. However, the functional consequences of 5-AzaC incorporation into RNA have been unknown. Here, we show that 5-AzaC treatment of cells leads to inhibition of protein synthesis. Ribo-seq, Disome-seq, and RNA-seq in cells treated with 5-AzaC exhibit a time-dependent C-to-G transversion signature in mRNAs within 2 h of treatment. These transversion events are enriched within footprint positions corresponding to the A-site of monosomes or of the leading stalled ribosome in a disome complex. Consistently, ribosome and disome footprints accumulate at sites with C-rich codons in the A-site, specifically codons containing a C in the second position. 5-AzaC activates the integrated stress response (ISR) and the ribotoxic stress response (RSR) in a GCN2- and ZAK-dependent manner, consistent with disome-mediated signaling. Furthermore, loss of the Ribosome Quality Control (RQC) factor, ZNF598, sensitizes cells to 5-AzaC. Collectively, our results support a model where 5-AzaC is rapidly incorporated into mRNAs, disrupts decoding, and triggers disome-mediated signaling pathways, which contribute to its cytotoxicity. These findings suggest that translation disruption represents an additional layer of 5-AzaC's mechanism of action, alongside its known DNA-mediated effects.
Rapiejko, A. R.; Reddy, M.; Sacchettini, J. C.; Shell, S. S.
Regulation of RNA pools allows for adaptation to changing environments and stress, which is especially important in pathogenic bacteria such as Mycobacterium tuberculosis. RNA degradation is a significant contributor to RNA abundance, and Ribonuclease (RNase) E has a rate-limiting role in degradation of a majority of mycobacterial transcripts. However, many open questions remain about the RNA substrate requirements and specificities for efficient cleavage by mycobacterial RNase E. Here, using both Mycolicibacterium smegmatis and M. tuberculosis RNase E, we demonstrate that this enzyme is only active on substrates with a minimum length of approximately 27 nt. Furthermore, we show that mycobacterial RNase E prefers substrates with 5' monophosphates rather than 5' triphosphates, and that the positions of cleavage events within substrates are dictated by both sequence and distance from the RNA ends. Our results also suggest that RNase E may be affected by product inhibition. Finally, we show that M. smegmatis RNase E behaves similarly to M. tuberculosis RNase E, validating the use of this model organism for RNA degradation studies.
Munozvilla, J. H.; Ontiveros, A.; Mishanina, T. V.
The human mitochondrial genome (mtDNA) encodes multiple proteins in the oxidative phosphorylation complexes as well as the ribosomal and transfer RNAs (tRNAs) needed for in situ translation. These genes are transcribed from only three promoters, producing polycistronic transcripts that are co-transcriptionally cleaved by mitochondrial RNase enzymes to release the majority of individual gene products. tRNAs separate many of these genes and are thought to serve as "punctuation" marks that enable RNase recognition, binding, and hydrolysis of the 5' "leader" and 3' "trailer" sequences flanking the tRNA. Mutations in the tRNA genes dominate the mtDNA-linked mitochondrial pathologies; yet a systematic study of the impact of tRNA sequence variation on RNase-catalyzed processing is lacking. Here, we employed human mitochondrial tRNATyr as a model system to dissect the effect of tRNA variants on in vitro 5' leader and 3' trailer hydrolysis. We found that nucleotide variations located near the catalytic interfaces, particularly within or near the tRNA acceptor stem, showed the strongest defects in 5' processing and prevented release of the downstream tRNA in a tRNA cluster where multiple tRNAs are transcribed in tandem. This work provides mechanistic insight into how mutations disrupt coordinated mitochondrial tRNA processing and establishes a framework for predicting variant effects based on their structural position relative to the processing enzymes.
Xu, M.; Ireri, S. W.; Prator, M.; Lostroh, P.; Cao, M.
Bacteria can be engineered to express double-stranded RNA (dsRNA) that modulates eukaryotic host gene expression in a programmable manner via RNA interference (RNAi). This requires robust and systematic strategies for dsRNA circuit design and expression. Here, we developed modular genetic parts compatible with the CIDAR MoClo system for rapid assembly of dsRNA expression constructs in Escherichia coli HT115(DE3). We validated dsRNA production in vitro and assessed RNAi efficiency in Caenorhabditis elegans. A constitutive dsRNA circuit achieved rapid and near-complete gene knockdown, whereas a Ptac-driven circuit enabled tunable, partial silencing while minimizing the leakiness commonly observed in standard feeding RNAi systems. Together, this work expands the synthetic biology toolkit for dsRNA delivery, enabling precise control of RNAi outcomes from partial to complete gene silencing in nematodes.
Harris, F. E.; Hu, Y.; Verma, S.; Adhya, S.; Zhou, W.; Xiao, J.
Repetitive extragenic palindromes (REPs) are the most abundant repetitive noncoding elements in the E. coli genome. Despite their abundance, the primary function of REPs has remained unclear. At different times, REPs have been proposed to contribute to chromosome organization, mRNA decay regulation, and transcription termination, among other functions. Here, we show that the model REP, REP325, does not measurably compact the chromosome but instead acts as a 3'UTR-associated transcription regulator within the yjdMN operon, functioning both as a partially Rho-dependent terminator that limits transcription into the downstream yjdN gene and as an mRNA stabilizer that protects the upstream yjdM transcript from degradation. This dual role in controlling both transcriptional readthrough and susceptibility to decay provides a framework that reconciles several previously conflicting observations about REP function. Our genome-wide RNA-seq analysis further reveals that REPs with more canonical sequence and hairpin structures are more often associated with upstream-biased expression in tandem gene pairs, and that REPs positioned between convergent genes correlate with elevated expression of both genes. The large variance in expression patterns in both gene pair configurations is consistent with context-dependent termination and degradation blocking. Similarly, REPs do not uniformly affect mRNA half-lives. Because REP locations vary between E. coli strains, REPs likely contribute to regulatory diversity by tuning gene expression without altering protein-coding sequences or promoter regions, opening new avenues for modulating gene expression through REP-mediated transcription regulation.
Cortot, M.; Stehlik, T.; Koch, A.; Schlemmer, T.
Efficient protein synthesis in eukaryotic cells typically requires a 5' cap structure on messenger RNAs (mRNAs). However, under stress conditions or in viral infection, translation can also occur independently of the cap via internal ribosomal entry sites (IRES). IRES elements are therefore key regulators of protein expression in both viral and cellular contexts. Here we describe a cell-free protocol to quantitatively assess IRES-mediated translation using wheat germ extract (WGE) and a firefly luciferase (FLuc) reporter. The protocol includes template preparation, RNA synthesis and luminescence measurement following in vitro translation in WGE. This method enables rapid and robust comparison of IRES activity under controlled conditions and can additionally be applied to evaluate mRNA modifications designed to enhance translation efficiency.
Key features: (i) Stringent in vitro workflow from DNA template preparation through RNA synthesis and protein synthesis to reporter readout, including quality controls. (ii) Evaluation of IRES-driven translation suitable for testing combinations of IRES and CDS. (iii) Translation analysis without radioactive labeling.
Graphical abstract: Pipeline for the production and evaluation of IRES-firefly luciferase constructs using wheat germ extract. (1-4) Preparation: IRES-firefly luciferase constructs are amplified in E. coli and isolated from bacterial cells. Plasmids are linearized to prepare for in vitro transcription. (5-6) Transcript synthesis and verification: In vitro transcription is followed by electrophoretic validation to confirm integrity and correct molecular weight. (7-8) Translation and detection: Translation is executed in wheat germ extract and quantified by measuring reporter activity in a luminometer.
Li, J.; Wang, J.; Dokholyan, N. V.
Due to the limited resolution of experimental data, many determined RNA structures contain physically implausible geometries, such as severe steric clashes and missing atoms. Resolving these defects during RNA structure refinement remains a fundamental challenge. Structure dictates the function, so the geometric accuracy of RNA structure is critical for understanding biological mechanisms. However, traditional algorithms for correction have limitations because of the complexity of RNA structures. We propose ChironRNA, an all-atom diffusion model with E(3)-equivariant graph neural networks to perform RNA refinement by resolving steric clashes and completing missing atoms. In ChironRNA, we adopt a hierarchical approach, including both an all-atom diffusion model and a coarse-grained diffusion model where each nucleotide is represented by a five-point representation. Our pipeline consists of two stages: a training stage and a generation stage. The diffusion model regenerates clashing nucleotide atoms step by step by removing the noise predicted by EGNN. ChironRNA achieves an 80% clash reduction on more than 80% of the test set. It performs better on structures of less than 200 nucleotides, resulting in a high percentage of cases having over 80% clash reduction rate and 100% atom reconstruction rate. Our results demonstrate that ChironRNA successfully resolves steric clashes and rebuilds missing atoms with high precision, offering a robust solution where traditional fine-tuning or enumerative approaches fail.
Miercke, S.; Schaubruch, K.; Maass, S.; Russeck, A. K.; Lawaetz, A. C.; Denham, E. L.; Heermann, R.; Mascher, T.
Survival of bacteria in their natural habitat requires dynamic responses and adaptation to environmental cues. In Bacillus subtilis, one adaptive strategy is cannibalism, a form of programmed cell death during post-exponential development. Cannibalism enhances multicellular differentiation by prolonging or preventing commitment to endospore formation under starvation conditions. B. subtilis produces three cannibalism toxins: the sporulation delay protein, the sporulation killing factor, and the epipeptide EPE. Production of the latter is encoded in the epeXEPAB operon. Expression of this operon is transcriptionally controlled by the stationary phase regulators Spo0A and AbrB. Here, we demonstrate that EPE production is also post-transcriptionally regulated by two RNA binding proteins, Kre and SpoVG. Deletion of comK, the master regulator of competence development, abolished EPE production. This defect was reversed by additionally deleting kre. The RNA-binding protein, Kre, binds the epeX transcript and acts as a bidirectional ComK repressor, indicating that ComK indirectly regulates EPE biosynthesis via Kre. A second RNA-binding protein, SpoVG, also binds to the epeX mRNA. While Kre acts as a negative regulator, SpoVG was essential for EPE production. These findings reveal a novel regulatory connection between competence and cannibalism, expanding our understanding of how programmed cell death is coordinated in B. subtilis.
Luo, H.; Tang, D.; Zivanov, A.; Miskov-Zivanov, N.
Designing next-generation Chimeric Antigen Receptors (CARs) requires a systematic understanding of intracellular signaling domains and their downstream biological effects, yet no comprehensive knowledge resource currently exists for this purpose. Here, we present an automated workflow that integrates multiple natural language processing and large language model tools to extract biomolecular interactions from PubMed literature and assemble them into a CAR T cell signaling knowledge graph. Our pipeline combines REACH, INDRA, and Llama 3 across 15 targeted search queries, yielding a directed multi-relational graph of ~7,500 unique interactions among ~1,800 entities, including proteins, biological processes, and chemicals. We further demonstrate that queries incorporating biological process ontology terms retrieve more interaction-rich papers than protein-name-only searches, offering practical guidance for future literature mining efforts. The resulting knowledge base provides a structured foundation for predicting T cell phenotypes and prioritizing intracellular domain candidates for CAR design, with broader applicability to knowledge-driven inference in immunotherapy research.
Fukui, K.; Shibuya, A.; Murakawa, T.; Yano, T.
GHKL ATPases share a unique Bergerat ATP-binding fold and regulate diverse biological processes through ATP-dependent conformational changes. An early step of ATP hydrolysis in this family has been attributed to a single highly conserved glutamate residue proposed to function as the general base. However, mutations of this residue impair both the ATPase activity and ATP binding, complicating interpretation of its catalytic role. Re-examination of the high-resolution crystal structures revealed a second conserved acidic residue positioned within hydrogen-bonding distance of the nucleophilic water molecule. Using Aquifex aeolicus MutL and GyrB as model enzymes, we combined systematic mutagenesis, ATPase and ATP-binding assays, and X-ray crystallography to dissect the roles of these residues. We show that alignment of the nucleophilic water can be maintained as long as the conserved glutamate retains hydrogen-bonding capability, whereas efficient ATP hydrolysis requires proton-accepting capacity in at least one of the two acidic residues. These results indicate that the conserved glutamate primarily governs positioning of the nucleophilic water, while activation of this water for catalysis is achieved through cooperative general base function of the two acidic residues. Extending this framework to human MutL homologs, PMS2 and MLH1, we showed that clinically reported variants of uncertain significance in these DNA mismatch repair proteins substantially reduced the ATPase activity, indicating functional impairment. Together, our findings refine the catalytic mechanism of GHKL ATPases and provide a structural and functional framework for interpreting disease-associated variants in GHKL ATPases. Phylogenetic and ancestral state analysis further indicated that the second acidic residue was likely to be present in the common ancestor of major GHKL ATPase lineages but was later modified in a branch including Hsp90, suggesting evolutionary remodeling of the catalytic mechanism in that branch.
Raatz, R. C.; Hammerl, D. R.; Kornyushenko, A.; Graumann, P.
The restart of replication forks that have become stalled or disintegrated during the replication cycle is vital for all organisms, and in many bacterial species involves the conserved and essential DNA helicase PriA. PriA has been shown to physically interact with the C-terminus of SSB, which also binds to several other proteins involved in DNA repair and restart. It has been proposed that PriA is enriched at all replication forks in Bacillus subtilis via SSB interaction, such that it is instantly present to respond to a requirement for restart. Using single molecule tracking, we show that SSB and PriA comprise populations with very different diffusion constants, ruling out that PriA co-migrates with fork-bound SSB. Indeed, PriA was enriched at forks in only a subset of exponentially growing cells, dependent on the C-terminus of SSB, but largely showed confined motion through the entire genome, searching for target sites in a transcription factor-like manner. Upon stalling of forks, SSB became highly enriched in all cells, suggesting a first line of response. PriA was also visibly enriched at forks following replication stress, in contrast to the primosome proteins DnaD and DnaI, which showed only moderate changes in localization or in single molecule motion. PriA dwell times were affected by the lack of the SSB C-terminus, and also by the absence of RecG helicase, which is involved in recombination events. Heterogeneity of restart proteins at replication forks also extends to the translesion DNA polymerases PolY1 and PolY2. Both proteins are of such low abundance that a considerable fraction of cells is devoid of any molecule. Our findings show that SSB accumulation is an initial response to replication stress, and that translesion synthesis and lesion skipping are less frequent events than fork remodelling.
Horecka, I.; Usaj, M.; Masinas, M. P. D.; Ward, H. N.; Zhang, X.; Hassan, A. Z.; Billmann, M.; Rost, H.; Myers, C. L.; Costanzo, M.; Andrews, B. J.; Boone, C.
Genetic interaction networks map functional connections between genes and their corresponding pathways and complexes. We previously developed TheCellMap.org as a central repository for storing and analyzing quantitative genetic interaction data produced by genome-scale Synthetic Genetic Array (SGA) analysis in the budding yeast, Saccharomyces cerevisiae. We have expanded TheCellMap.org to include ~89,000 quantitative genetic interactions identified from genome-scale CRISPR-based analysis of ~4 million human gene pairs in the haploid cell line, HAP1. TheCellMap.org enables users to readily access, visualize and explore human HAP1 genetic interactions, as well as to extract and reorganize sub-networks, applying data-driven network layouts in an intuitive and interactive manner.
Finkel, J. M.; Williams, M. G.; Nirmal, M. B.; Pandey, S.; Howe, E. D.; Liu, C. T.; Lohman, J. R.; Sharma, N.; Vo, T. V.
Background/Objectives: RNA polymerase II is a multifunctional complex that is critical for gene regulation and environmental responses. Its POLR2I subunit in human is associated with various pathologies, including cancer chemoresistance. However, much of our understanding of how POLR2I could function indirectly derives from studies of its homologs in yeasts called Rpb9. Here, we endogenously humanized the rpb9 gene of the fission yeast Schizosaccharomyces pombe to examine the functional capabilities of POLR2I. Methods: We edited the genomic rpb9 locus in S. pombe so that it encodes the human POLR2I protein, and investigated functional and structural conservation. Results: With our humanized yeast system, we find widespread functional complementation by human POLR2I of S. pombe rpb9 roles in yeast growth, chronological aging, and stress responses. We also find that POLR2I complements novel roles for yeast rpb9 in facultative heterochromatin assembly, resistance against the chemotherapy 5-fluorouracil, and resistance against the fungicide thiabendazole. In contrast, we find that POLR2I cannot complement the role of rpb9 in resistance against the transcription elongation inhibitor 6-azauracil (6-AU) in our system. Interestingly, POLR2I could complement 6-AU resistance if ectopically expressed. Lastly, we observe extensive structural homology between Rpb9 and POLR2I proteins. Conclusions: Our study establishes an endogenous cross-species gene complementation strategy that uncovers both conserved and rewired functions of fission yeast rpb9 and its human homolog, POLR2I. In addition to validating conserved roles, we also identified conservation of previously unrecognized roles of rpb9 in heterochromatin formation and chemoresistance.
Song, K. S.; Cyr, M.; Faucher-Giguere, L.; Yeo, B.; Seow, V. K.; Deschamps-Francoeur, G.; Abou Elela, S.; Scott, M. S.
Small nucleolar RNAs (snoRNAs) are canonically viewed as stable components of ribonucleoprotein complexes dedicated to RNA modification. Here, we developed snoFlake, a snoRNA-centric interaction network integrating physical and functional associations between box C/D snoRNAs and RNA-binding proteins (RBPs), challenging this narrow view. Using snoFlake, we systematically identified snoRNAs predicted to form noncanonical complexes with diverse RBPs, extending their roles into post-transcriptional regulation. We found 23 high-confidence network motifs enriched for RNA-processing functions, including a top-ranked module linking SNORD22 to U5 snRNP components PRPF8 and EFTUD2. SNORD22 co-binds with these spliceosomal RBPs at splice sites showing reduced U5 snRNP occupancy, suggesting a role in reinforcing spliceosomal engagement at suboptimal exons. Consistently, SNORD22 depletion promotes exclusion of weak cassette exons, altering transcript isoform composition and predicted coding output. Beyond SNORD22, snoFlake reveals snoRNAs with similar network profiles, providing a resource for uncovering previously uncharacterized snoRNA-RBP complexes and expanding the functional snoRNome.